A Multilevel Approach to Topology-Aware Collective Operations in Computational Grids

Authors

  • Nicholas T. Karonis
  • Bronis R. de Supinski
  • Ian T. Foster
  • William Gropp
  • Ewing L. Lusk
Abstract

The efficient implementation of collective communication operations has received much attention. Initial efforts produced "optimal" trees based on network communication models that assumed equal point-to-point latencies between any two processes. This assumption is violated in most practical settings, however, particularly in heterogeneous systems such as clusters of SMPs and wide-area "computational Grids," with the result that collective operations perform suboptimally. In response, more recent work has focused on creating topology-aware trees for collective operations that minimize communication across slower channels (e.g., a wide-area network). While these efforts have significant communication benefits, they all limit their view of the network to only two layers. We present a strategy based upon a multilayer view of the network. By creating multilevel topology-aware trees we take advantage of communication cost differences at every level in the network. We used this strategy to implement topology-aware versions of several MPI collective operations in MPICH-G2, the Globus Toolkit™-enabled version of the popular MPICH implementation of the MPI standard. Using information about topology provided by MPICH-G2, we construct these multilevel topology-aware trees automatically during execution. We present results demonstrating the advantages of our multilevel approach by comparing it to the default (topology-unaware) implementation provided by MPICH and a topology-aware two-layer implementation.
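The idea of a multilevel topology-aware tree can be illustrated with a minimal Python sketch of a broadcast. This is not the MPICH-G2 implementation: the function name `multilevel_bcast_edges` and the `colors` encoding (one tuple of group ids per rank, slowest channel first) are assumptions made purely for illustration. At each level the current leader sends once to a single representative of every other group, and each representative then repeats the process inside its own group over the next, faster channel.

```python
from collections import defaultdict


def multilevel_bcast_edges(root, colors):
    """Return (sender, receiver, level) edges for a multilevel broadcast.

    colors[rank] is a tuple of group ids, slowest level first (a
    hypothetical encoding). Level 0 is the slowest channel; the level
    equal to the tuple length is the fastest, innermost channel.
    """
    depth = len(colors[root])
    edges = []

    def spread(leader, members, level):
        if level == depth:
            # Fastest level: the leader sends directly to every other member.
            edges.extend((leader, r, level) for r in sorted(members) if r != leader)
            return
        # Partition the members by their group id at this level.
        groups = defaultdict(list)
        for r in members:
            groups[colors[r][level]].append(r)
        # One representative per group; the leader represents its own group.
        reps = {gid: (leader if leader in grp else min(grp))
                for gid, grp in groups.items()}
        # Cross each slow channel exactly once per remote group.
        for gid in sorted(groups):
            if reps[gid] != leader:
                edges.append((leader, reps[gid], level))
        # Continue inside each group over the next, faster channel.
        for gid in sorted(groups):
            spread(reps[gid], groups[gid], level + 1)

    spread(root, list(colors), 0)
    return edges


# Hypothetical layout: two sites, each with two machines.
colors = {
    0: ("siteA", "m1"), 1: ("siteA", "m1"), 2: ("siteA", "m2"),
    3: ("siteB", "m3"), 4: ("siteB", "m3"), 5: ("siteB", "m4"),
}
edges = multilevel_bcast_edges(0, colors)
```

In this layout only one message crosses the wide-area level (rank 0 to rank 3), whereas a topology-unaware tree may cross it several times. A flat send is used at the innermost level for brevity; a binomial tree could be substituted there.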

Similar resources

Exploiting Hierarchy in Parallel Computer Networks to Optimize Collective Operation Performance

The efficient implementation of collective communication operations has received much attention. Initial efforts modeled network communication and produced "optimal" trees based on those models. However, the models used by these initial efforts assumed equal point-to-point latencies between any two processes. This assumption is violated in heterogeneous systems such as clusters of SMPs and wide-are...

TOPOLOGY OPTIMIZATION OF DOUBLE LAYER GRIDS FOR EARTHQUAKE LOADS USING A TWO-STAGE ESO-ACO METHOD

A two-stage optimization method is presented that combines evolutionary structural optimization (ESO) and ant colony optimization (ACO), called the ESO-ACO method. To implement ESO-ACO, size optimization is first performed using ESO. Then, the outcomes of ESO are employed to enhance ACO. In the optimization process, the weight of the double layer grid is minimized under various constraints whi...

Construction of Hexahedral Block Topology and its Decomposition to Generate Initial Tetrahedral Grids for Aerodynamic Applications

Making an initial tetrahedral grid for complex geometry can be a tedious and time-consuming task. This paper describes a novel procedure for generating starting tetrahedral cells using hexahedral block topology. Hexahedral blocks are arranged around an aerodynamic body to form a flow domain. Each of the hexahedral blocks is then decomposed into six tetrahedral elements to obtain an initial t...

Multicast Routing in Wireless Sensor Networks: A Distributed Reinforcement Learning Approach

Wireless Sensor Networks (WSNs) consist of independent distributed sensors with storing, processing, sensing and communication capabilities to monitor physical or environmental conditions. There are a number of challenges in WSNs because of limited battery power, communications, computation and storage space. In recent years, computational intelligence approaches such as evolutionar...

MPI Applications on Grids: A Topology Aware Approach

Porting complex MPI applications involving collective communications to grids requires significant program modification, usually dedicated to a single grid structure. The difficulty comes from the mismatch between program organization and grid structures: 1) large grids are hierarchical structures aggregating parallel machines through an interconnection network, decided at runtime and 2) the ...

Journal:
  • CoRR

Volume: cs.DC/0206038

Publication date: 2002